Thresholding Classifiers to Maximize F1 Score

Authors

Zachary Chase Lipton

Charles Elkan

Balakrishnan Narayanaswamy

Abstract

This paper investigates the properties of the F1 metric, which is widely used to evaluate the performance of multi-label classifiers. We show that given an uninformative binary classifier, F1-optimal thresholding is to predict all instances as positive. More surprisingly, we prove a relationship between the optimal threshold and the best achievable F1 score over all thresholds. We demonstrate that macro-averaged F1, a commonly used multi-label performance metric, can conceal this extreme thresholding behavior. Finally, based on these properties of F1, we suggest average skill score as an alternative to macro-averaged F1 for multi-label classification.
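The claim about uninformative classifiers can be illustrated with a small sketch. It is not the authors' code. It assumes a classifier that labels a fraction `predict_rate` of instances positive, independently of the true label, and it computes F1 from expected counts rather than as an expectation of F1. The function names and the grid search are hypothetical and serve only as illustration.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*tp / (2*tp + fp + fn); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def f1_uninformative(base_rate, predict_rate):
    """F1 on expected counts (per instance) for a classifier whose positive
    predictions are independent of the true labels (assumption of this sketch)."""
    tp = base_rate * predict_rate            # positives that happen to be flagged
    fp = (1.0 - base_rate) * predict_rate    # negatives that happen to be flagged
    fn = base_rate * (1.0 - predict_rate)    # positives that are missed
    return f1_from_counts(tp, fp, fn)


def best_predict_rate(base_rate, steps=100):
    """Grid search over the fraction of instances predicted positive."""
    rates = [i / steps for i in range(steps + 1)]
    return max(rates, key=lambda p: f1_uninformative(base_rate, p))


if __name__ == "__main__":
    for b in (0.01, 0.1, 0.5):
        p = best_predict_rate(b)
        print(f"base rate {b}: best predict rate {p}, F1 {f1_uninformative(b, p):.4f}")
```

Under these assumptions the expression simplifies to 2bp / (p + b) for base rate b and predict rate p, which increases in p. The search therefore always lands on predicting every instance positive, with F1 equal to 2b / (1 + b).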

Related resources

Optimal Thresholding of Classifiers to Maximize F1 Measure

This paper provides new insight into maximizing F1 measures in the context of binary classification and also in the context of multilabel classification. The harmonic mean of precision and recall, the F1 measure is widely used to evaluate the success of a binary classifier when one class is rare. Micro average, macro average, and per instance average F1 measures are used in multilabel classific...

Multi-label Text Categorization with Model Combination based on F1-score Maximization

Text categorization is a fundamental task in natural language processing, and is generally defined as a multi-label categorization problem, where each text document is assigned to one or more categories. We focus on providing good statistical classifiers with a generalization ability for multi-label categorization and present a classifier design method based on model combination and F1-score ma...

Extreme F-measure Maximization using Sparse Probability Estimates

We consider the problem of (macro) F-measure maximization in the context of extreme multi-label classification (XMLC), i.e., multi-label classification with extremely large label spaces. We investigate several approaches based on recent results on the maximization of complex performance measures in binary classification. According to these results, the F-measure can be maximized by properly thr...

Learning Classifiers from Imbalanced, Only Positive and Unlabeled Data Sets

In this report, I presented my results to the tasks of 2008 UC San Diego Data Mining Contest. This contest consists of two classification tasks based on data from scientific experiment. The first task is a binary classification task which is to maximize accuracy of classification on an evenly-distributed test data set, given a fully labeled imbalanced training data set. The second task is also ...

Automatic Assignment of Non-Leaf MeSH Terms to Biomedical Articles

Assigning labels from a hierarchical vocabulary is a well known special case of multi-label classification, often modeled to maximize micro F1-score. However, building accurate binary classifiers for poorly performing labels in the hierarchy can improve both micro and macro F1-scores. In this paper, we propose and evaluate classification strategies involving descendant node instances to build b...

Publication date: 2014